Visual Programming


Visual Programming for Step-by-Step Text-to-Image Generation and Evaluation

Neural Information Processing Systems

As large language models have demonstrated impressive performance in many domains, recent works have adopted language models (LMs) as controllers of visual modules for vision-and-language tasks. While existing work focuses on equipping LMs with visual understanding, we propose two novel interpretable/explainable visual programming frameworks for text-to-image (T2I) generation and evaluation. First, we introduce VPGen, an interpretable step-by-step T2I generation framework that decomposes T2I generation into three steps: object/count generation, layout generation, and image generation. We employ an LM to handle the first two steps (object/count generation and layout generation), by finetuning it on text-layout pairs. Our step-by-step T2I generation framework provides stronger spatial control than end-to-end models, the dominant approach for this task.
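To make the three-step decomposition concrete, here is a minimal Python sketch of a VPGen-style pipeline. The function names, signatures, and the box format are assumptions made for exposition, not VPGen's actual interface.

```python
# Illustrative sketch of a VPGen-style step-by-step T2I pipeline.
# All stage interfaces below are assumptions for exposition, not VPGen's API.

def generate_objects_and_counts(prompt: str) -> list[tuple[str, int]]:
    # Step 1: a finetuned LM would predict objects and counts from the prompt.
    # Hard-coded here to keep the sketch self-contained.
    return [("dog", 2), ("ball", 1)]

def generate_layout(objects: list[tuple[str, int]]) -> list[dict]:
    # Step 2: the LM assigns a bounding box (x, y, w, h in [0, 1]) per instance.
    layout = []
    for name, count in objects:
        for i in range(count):
            layout.append({"object": name, "box": (0.1 + 0.4 * i, 0.3, 0.3, 0.4)})
    return layout

def generate_image(prompt: str, layout: list[dict]):
    # Step 3: a layout-conditioned image generator renders the final image.
    # Stubbed: a real system would call a layout-to-image model here.
    return {"prompt": prompt, "layout": layout}

prompt = "two dogs playing with a ball"
objects = generate_objects_and_counts(prompt)
layout = generate_layout(objects)
print(generate_image(prompt, layout))
```

Because the layout is an explicit intermediate, a user can inspect or edit the predicted boxes before image generation, which is where the stronger spatial control comes from.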


AIAP: A No-Code Workflow Builder for Non-Experts with Natural Language and Multi-Agent Collaboration

An, Hyunjn, Kim, Yongwon, Seo, Wonduk, Park, Joonil, Kang, Daye, Oh, Changhoon, Kim, Dokyun, Lee, Seunghyun

arXiv.org Artificial Intelligence

While many tools are available for designing AI, non-experts still face challenges in clearly expressing their intent and managing system complexity. We introduce AIAP, a no-code platform that integrates natural language input with visual workflows. AIAP leverages a coordinated multi-agent system to decompose ambiguous user instructions into modular, actionable steps, hidden from users behind a unified interface. A user study involving 32 participants showed that AIAP's AI-generated suggestions, modular workflows, and automatic identification of data, actions, and context significantly improved participants' ability to develop services intuitively. These findings highlight that natural language-based visual programming significantly reduces barriers and enhances user experience in AI service design.
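As a hypothetical sketch of the decomposition idea, the snippet below shows a coordinator fanning an ambiguous instruction out to specialist agents for data, actions, and context, then merging their outputs into modular workflow steps. All agent logic and names here are illustrative assumptions, not AIAP's implementation.

```python
# Hypothetical sketch of AIAP-style coordination: a coordinator splits an
# ambiguous instruction across specialist "agents" and merges their outputs
# into modular workflow steps. The agent roles mirror the paper's data /
# action / context categories; everything else is an assumption.

def data_agent(instruction: str) -> list[str]:
    # Identify data references (here: file-like tokens).
    return [w for w in instruction.split() if w.endswith(".csv")]

def action_agent(instruction: str) -> list[str]:
    # Identify actionable verbs.
    verbs = {"summarize", "translate", "classify", "email"}
    return [w for w in instruction.lower().split() if w in verbs]

def context_agent(instruction: str) -> str:
    # Infer task context from cues in the instruction.
    return "weekly report" if "weekly" in instruction.lower() else "ad hoc task"

def coordinator(instruction: str) -> list[dict]:
    # Merge agent outputs into ordered, modular steps the UI can render.
    data = data_agent(instruction)
    context = context_agent(instruction)
    return [{"step": i + 1, "action": a, "inputs": data, "context": context}
            for i, a in enumerate(action_agent(instruction))]

print(coordinator("Summarize sales.csv and email the weekly result"))
```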


Can We Generate Visual Programs Without Prompting LLMs?

Shlapentokh-Rothman, Michal, Wang, Yu-Xiong, Hoiem, Derek

arXiv.org Artificial Intelligence

Visual programming prompts LLMs (large language models) to generate executable code for visual tasks like visual question answering (VQA). Prompt-based methods are difficult to improve while also being unreliable and costly in both time and money. Our goal is to develop an efficient visual programming system that avoids 1) using prompt-based LLMs at inference time and 2) requiring a large set of program and answer annotations. We develop a synthetic data augmentation approach and an alternative program generation method based on decoupling programs into higher-level skills, called templates, and their corresponding arguments. Our results show that with data augmentation, prompt-free smaller LLMs (approximately 1B parameters) are competitive with state-of-the-art models, with the added benefit of much faster inference.
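The sketch below illustrates the template/argument decoupling: one step picks a program template (a reusable skill), a separate step fills in its arguments. The templates and the keyword-based selectors are illustrative assumptions, standing in for the paper's trained models.

```python
# Sketch of template/argument decoupling: select a program template (skill),
# then fill its argument slots. The template set and the heuristic selectors
# are assumptions for this sketch, not the paper's trained components.

TEMPLATES = {
    "count":  "boxes = detect(image, '{obj}')\nanswer = len(boxes)",
    "exists": "boxes = detect(image, '{obj}')\nanswer = len(boxes) > 0",
    "attribute": "box = detect(image, '{obj}')[0]\nanswer = query_attribute(box, '{attr}')",
}

def select_template(question: str) -> str:
    # A small trained model would do this; a keyword heuristic stands in.
    q = question.lower()
    if q.startswith("how many"):
        return "count"
    if q.startswith("is there") or q.startswith("are there"):
        return "exists"
    return "attribute"

def fill_arguments(question: str) -> dict:
    # A trained argument predictor would extract these; a toy rule stands in.
    words = question.rstrip("?").split()
    return {"obj": words[-1], "attr": "color"}

question = "How many dogs are in the picture"
name = select_template(question)
program = TEMPLATES[name].format(**fill_arguments(question))
print(program)
```

Because the template inventory is small and fixed, program generation reduces to classification plus slot filling, which is what makes small prompt-free models viable.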


Text2VP: Generative AI for Visual Programming and Parametric Modeling

Feng, Guangxi, Yan, Wei

arXiv.org Artificial Intelligence

The integration of generative artificial intelligence (AI) into architectural design has evolved significantly, marked by recent advances in AI for generating text, images, and 3D models. However, no models exist for generating the parametric models used in architectural design to produce various design options, including free-form designs, and to optimize those options. This study creates and investigates an innovative application of generative AI in parametric modeling by leveraging a customized Text-to-Visual Programming (Text2VP) GPT derived from GPT-4. The primary focus is on automating the generation of graph-based visual programming workflows, including parameters and the links among them, through AI-generated scripts that accurately reflect users' design intentions and allow users to change parameter values interactively. The Text2VP GPT customization process utilizes detailed and complete documentation of the visual programming language components, example-driven few-shot learning, and specific instructional guides. Our testing demonstrates Text2VP's capability to generate working parametric models. The paper also discusses Text2VP's limitations; for example, more complex parametric model generation introduces higher error rates. This research highlights the potential of generative AI in visual programming and parametric modeling and sets a foundation for future enhancements to handle more sophisticated and intricate modeling tasks effectively. The study aims to allow designers to create and change design models without significant effort spent learning a specific programming language such as Grasshopper.
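For intuition, here is an illustrative data structure for the kind of graph-based parametric workflow such a script might emit: parameter nodes feed components through links, and edits to a parameter propagate downstream. The schema is an assumption made for exposition, not Grasshopper's actual file format or Text2VP's output.

```python
# Illustrative representation of a graph-based parametric workflow:
# parameter nodes are linked into components. The schema is an assumption
# for exposition, not Grasshopper's format or Text2VP's actual output.

workflow = {
    "parameters": {
        "radius": {"type": "slider", "min": 1.0, "max": 10.0, "value": 4.0},
        "height": {"type": "slider", "min": 1.0, "max": 20.0, "value": 8.0},
    },
    "components": {
        "circle":  {"op": "Circle",  "inputs": {"R": "radius"}},
        "extrude": {"op": "Extrude", "inputs": {"base": "circle", "H": "height"}},
    },
}

def set_parameter(wf: dict, name: str, value: float) -> None:
    # Interactive edits update the parameter node (clamped to its range);
    # downstream components re-evaluate via the links.
    p = wf["parameters"][name]
    p["value"] = max(p["min"], min(p["max"], value))

set_parameter(workflow, "radius", 6.5)
print(workflow["parameters"]["radius"]["value"])  # 6.5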


Recursive Visual Programming

Ge, Jiaxin, Subramanian, Sanjay, Shi, Baifeng, Herzig, Roei, Darrell, Trevor

arXiv.org Artificial Intelligence

Visual Programming (VP) has emerged as a powerful framework for Visual Question Answering (VQA). By generating and executing bespoke code for each question, these methods demonstrate impressive compositional and reasoning capabilities, especially in few-shot and zero-shot scenarios. However, existing VP methods generate all code in a single function, resulting in code that is suboptimal in terms of both accuracy and interpretability. Inspired by human coding practices, we propose Recursive Visual Programming (RVP), which simplifies generated routines, provides more efficient problem solving, and can manage more complex data structures. RVP approaches VQA tasks with an iterative, recursive code generation approach, allowing decomposition of complicated problems into smaller parts. Notably, RVP is capable of dynamic type assignment: as the system recursively generates a new piece of code, it autonomously determines the appropriate return type and crafts the requisite code to produce that output. We show RVP's efficacy through extensive experiments on benchmarks including VSR, COVR, GQA, and NextQA, underscoring the value of adopting human-like recursive and modular programming techniques for solving VQA tasks through coding.
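A minimal sketch of the recursive idea follows: answering a comparison question spawns two counting sub-questions, each answered by a recursive call that decides its own return type. The question routing and the stubbed detector are illustrative assumptions, not RVP's generated code.

```python
# Minimal sketch of RVP-style recursion: a question may decompose into
# sub-questions answered recursively, and each branch decides its own
# return type. The routing rules and stub detector are assumptions.

def detect(image, name: str) -> list:
    # Stub detector; a real system would call a vision model here.
    return [object()] * {"dogs": 2, "cats": 1}.get(name, 0)

def answer(image, question: str):
    q = question.lower().rstrip("?")
    if q.startswith("how many"):
        # Dynamic type assignment: this branch returns an int.
        return len(detect(image, q.split()[-1]))
    if " more " in q and " than " in q:
        # Decompose "more X than Y" into two counting sub-questions,
        # answer each recursively, then combine; this branch returns a bool.
        left, right = q.split(" more ")[1].split(" than ")
        return answer(image, f"how many {left}") > answer(image, f"how many {right}")
    return "unknown"

print(answer(None, "Are there more dogs than cats?"))  # True
```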


VISAR: A Human-AI Argumentative Writing Assistant with Visual Programming and Rapid Draft Prototyping

Zhang, Zheng, Gao, Jie, Dhaliwal, Ranjodh Singh, Li, Toby Jia-Jun

arXiv.org Artificial Intelligence

In argumentative writing, writers must brainstorm hierarchical writing goals, ensure the persuasiveness of their arguments, and revise and organize their plans through drafting. Recent advances in large language models (LLMs) have made interactive text generation through a chat interface (e.g., ChatGPT) possible. However, this approach often neglects implicit writing context and user intent, lacks support for user control and autonomy, and provides limited assistance for sensemaking and revising writing plans. To address these challenges, we introduce VISAR, an AI-enabled writing assistant system designed to help writers brainstorm and revise hierarchical goals within their writing context, organize argument structures through synchronized text editing and visual programming, and enhance persuasiveness with argumentation spark recommendations. VISAR allows users to explore, experiment with, and validate their writing plans using automatic draft prototyping. A controlled lab study confirmed the usability and effectiveness of VISAR in facilitating the argumentative writing planning process.


Visual Programming: Compositional visual reasoning without training

Gupta, Tanmay, Kembhavi, Aniruddha

arXiv.org Artificial Intelligence

We present VISPROG, a neuro-symbolic approach to solving complex and compositional visual tasks given natural language instructions. VISPROG avoids the need for any task-specific training. Instead, it uses the in-context learning ability of large language models to generate Python-like modular programs, which are then executed to get both the solution and a comprehensive and interpretable rationale. Each line of the generated program may invoke one of several off-the-shelf computer vision models, image processing routines, or Python functions to produce intermediate outputs that may be consumed by subsequent parts of the program. We demonstrate the flexibility of VISPROG on four diverse tasks: compositional visual question answering, zero-shot reasoning on image pairs, factual knowledge object tagging, and language-guided image editing. We believe neuro-symbolic approaches like VISPROG are an exciting avenue to easily and effectively expand the scope of AI systems to serve the long tail of complex tasks that people may wish to perform.
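The toy interpreter below conveys the execution model: each program line invokes a named module and binds its output for later lines, so every intermediate result is kept as an inspectable trace. The two-module registry and the LINE=MODULE(args) syntax are assumptions made for this sketch, not VISPROG's actual module set.

```python
# Toy interpreter in the spirit of VISPROG: each line invokes a named module
# and binds its output for later lines, yielding an inspectable trace.
# The module registry and line syntax are assumptions for this sketch.

MODULES = {
    "LOC":   lambda env, obj: [(10, 10, 50, 50)],      # stub object detector
    "COUNT": lambda env, boxes: len(env[boxes]),        # count a prior output
}

def run(program: str) -> dict:
    env = {}
    for line in program.strip().splitlines():
        target, call = line.split("=", 1)
        name, args = call.strip().rstrip(")").split("(", 1)
        argv = [a.strip() for a in args.split(",") if a.strip()]
        env[target.strip()] = MODULES[name](env, *argv)
    return env  # every intermediate is kept: the interpretable rationale

trace = run("""
BOXES=LOC(dogs)
ANSWER=COUNT(BOXES)
""")
print(trace["ANSWER"])  # 1
```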


Machine Learning with Visual Programming

#artificialintelligence

Machine learning (ML) is a part of artificial intelligence (AI) that teaches a computer to work and make decisions based on historical data. An ML algorithm learns from historical data to generate a predictive model used to forecast future outcomes. Advanced ML models can be applied in AI applications such as recommender systems, text processing, and image recognition. To work with ML, a data scientist should have a good knowledge of mathematics and statistics, and the ability to process data and interpret the results. To process the data, you have to use specific tools or be able to program.


Democratized image analytics by visual programming through integration of deep models and small-scale machine learning

#artificialintelligence

Deep learning [1] has revolutionized the field of biomedical image analysis. Conventional approaches have used problem-specific algorithms to describe images with manually crafted features, such as cell morphology, count, intensity, and texture. Feature learning with deep convolutional neural networks is implicit, and training the network usually focuses on particular tasks, such as breast cancer detection in mammography [2], subcellular protein localization [3], or plant disease detection [4]. Training a deep network usually requires a large number of images, which limits its utility. For example, the classifier for plant disease detection by Mohanty et al. [4] was trained on 54,306 images of diseased and healthy plants, and the yeast protein localization model by Kraus et al. [3] was inferred from 22,000 annotated images, but not everyone who could benefit from image analysis has so many well-annotated images.